
Principal Component Analysis (PCA) Basics Quiz
Created by Shiju P John · 9/22/2025
Subject
Machine Learning
Exam
Any
Language
English
Mode
Practice
Taken
0 times
No. of Questions
10
Availability
Free
Description
This quiz focuses on the fundamental concepts of Principal Component Analysis (PCA), a powerful dimensionality reduction technique. It covers the 'why' and 'how' of PCA, exploring its applications in data science, machine learning, and statistics. Questions delve into the core mathematical principles, including covariance matrices, eigenvectors, and eigenvalues, as well as practical considerations like data preprocessing, component interpretation, and selection. A solid understanding of PCA is crucial for handling high-dimensional data, improving model performance, and gaining insights from complex datasets.
Key formulas and concepts involved in PCA:
- Covariance between two variables X and Y: This measures how two variables change together. A positive covariance indicates that they tend to increase or decrease together, while a negative covariance indicates an inverse relationship.
  $\mathrm{Cov}(X, Y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$
  Where $\bar{x}$ and $\bar{y}$ are the means of variables X and Y, respectively, and $n$ is the number of observations.
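The sample covariance formula above can be checked with a short NumPy sketch (the data values here are illustrative):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Sample covariance: sum of (x_i - x_bar) * (y_i - y_bar), divided by (n - 1)
n = len(x)
cov_manual = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)

# np.cov returns the 2x2 covariance matrix; entry [0, 1] is Cov(x, y)
cov_numpy = np.cov(x, y)[0, 1]
print(cov_manual, cov_numpy)
```

Both values agree, since `np.cov` uses the same (n - 1) denominator by default.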
- Covariance Matrix: For a dataset with multiple features, the covariance matrix summarizes the covariances between all pairs of features. If $X$ is a centered data matrix (mean of each column is 0) where columns are features and rows are observations:
  $C = \frac{1}{n-1} X^{\top} X$
  The diagonal elements are the variances of each feature, and the off-diagonal elements are the covariances between pairs of features.
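A minimal sketch of building the covariance matrix from a centered data matrix, using synthetic data (the shapes and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))        # 100 observations (rows), 3 features (columns)

Xc = X - X.mean(axis=0)              # center each column so its mean is 0
C = Xc.T @ Xc / (X.shape[0] - 1)     # covariance matrix, shape (3, 3)

# Matches NumPy's built-in (rowvar=False means columns are variables)
print(np.allclose(C, np.cov(X, rowvar=False)))
```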
- Eigenvalue Problem: Principal components are derived by solving the eigenvalue problem for the covariance matrix (or the correlation matrix, if the data is standardized). For a square matrix $C$ (our covariance matrix):
  $C v = \lambda v$
  Where $v$ is an eigenvector and $\lambda$ is its corresponding eigenvalue. The eigenvectors represent the directions (principal components), and the eigenvalues represent the magnitude of variance along those directions.
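The eigenvalue relation can be verified numerically; here is a sketch with a small, hand-picked symmetric matrix standing in for a covariance matrix:

```python
import numpy as np

C = np.array([[4.0, 2.0],
              [2.0, 3.0]])           # a symmetric 2x2 "covariance" matrix

# eigh is designed for symmetric matrices; eigenvalues return in ascending order
eigvals, eigvecs = np.linalg.eigh(C)

# Check C v = lambda v for each eigenpair (eigenvectors are the columns)
for lam, v in zip(eigvals, eigvecs.T):
    assert np.allclose(C @ v, lam * v)
```

The eigenvector with the largest eigenvalue is the first principal component's direction.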
- Explained Variance Ratio for a Principal Component: This metric quantifies the proportion of the total variance in the dataset that is captured by a specific principal component.
  $\text{Explained Variance Ratio}_i = \frac{\lambda_i}{\sum_{j=1}^{d} \lambda_j}$
  Where $\lambda_i$ is the eigenvalue corresponding to the $i$-th principal component, and $d$ is the total number of features (or components). This ratio is crucial for selecting the number of components to retain.
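The ratio above directly drives component selection; a small sketch with hypothetical eigenvalues shows how to pick the fewest components covering a target (here, 90%) of the variance:

```python
import numpy as np

eigvals = np.array([4.0, 2.5, 1.0, 0.5])   # hypothetical eigenvalues, sorted descending

# Explained variance ratio: each eigenvalue divided by the sum of all eigenvalues
ratios = eigvals / eigvals.sum()
print(ratios)                               # the ratios sum to 1

# Smallest k whose cumulative explained variance reaches 90%
k = int(np.searchsorted(np.cumsum(ratios), 0.90)) + 1
print(k)
```

With these eigenvalues the first two components explain 81.25% of the variance, so three components are needed to pass the 90% threshold.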